Convergence rates for density estimators of 
weakly dependent time series 



Nicolas Ragache¹ and Olivier Wintenberger²

¹ MAP5, Université René Descartes, 45 rue des Saints-Pères, 75270 Paris, France,
nicolas.ragache@ensae.fr
² SAMOS, Statistique Appliquée et MOdélisation Stochastique, Université Paris 1,
Centre Pierre Mendès France, 90 rue de Tolbiac, F-75634 Paris Cedex 13,
France, olivier.wintenberger@univ-paris1.fr

1 Introduction 

Assume that $(X_n)_{n\in\mathbb{Z}}$ is a sequence of $\mathbb{R}^d$-valued random variables with common distribution which is absolutely continuous with respect to Lebesgue's measure, with density $f$. Stationarity is not assumed, so that the case of a sampled process $\{X_{i,n} = x_{h_n(i)}\}_{1\le i\le n}$, for any sequence of monotonic functions $(h_n(\cdot))_{n\in\mathbb{Z}}$ and any stationary process $(x_n)_{n\in\mathbb{Z}}$ that admits a marginal density, is included. This paper investigates convergence rates for density estimation in different cases. First, we consider two concepts of weak dependence:

• Non-causal $\eta$-dependence introduced in [DL99] by Doukhan & Louhichi,

• Dedecker & Prieur's $\tilde\phi$-dependence (see [DP04]).

These two notions of dependence cover a large number of examples of time series (see Section 3). Next, following Doukhan (see [Dou90]), we propose a unified study of linear density estimators $\hat f_n$ of the form

$$\hat f_n(x) = \frac{1}{n}\sum_{i=1}^{n} K_{m_n}(x, X_i), \qquad (1)$$

where $\{K_{m_n}\}$ is a sequence of kernels. Under classical assumptions on $\{K_{m_n}\}$ (see Section 2.2), the results in the case of independent and identically distributed (i.i.d. in short) observations $X_i$ are well known (see for instance [Tsy04]). At a fixed point $x \in \mathbb{R}^d$, the sequence $m_n$ can be chosen such that

$$\|\hat f_n(x) - f(x)\|_q = O\big(n^{-\rho/(2\rho+d)}\big), \qquad (2)$$

where $\|X\|_q^q = \mathbb{E}|X|^q$. The coefficient $\rho > 0$ measures the regularity of $f$ (see Section 2.2 for the definition of the notion of regularity). The same rate of convergence also holds for the Mean Integrated Square Error (MISE), defined as $\int \|\hat f_n(x) - f(x)\|_2^2\, p(x)\,dx$ for some nonnegative and integrable function $p$. The rate of uniform convergence on a compact set incurs a logarithmic loss. For all $M > 0$ and for a suitable choice of the sequence $m_n$,



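$$\mathbb{E}\sup_{\|x\|\le M} |\hat f_n(x) - f(x)|^q = O\Big(\Big(\frac{\log n}{n}\Big)^{q\rho/(d+2\rho)}\Big), \qquad (3)$$

and

$$\sup_{\|x\|\le M} |\hat f_n(x) - f(x)| =_{a.s.} O\Big(\Big(\frac{\log n}{n}\Big)^{\rho/(d+2\rho)}\Big). \qquad (4)$$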



These rates are optimal in the minimax sense. We thus have no hope of improving on them in the dependent setting. A wide literature deals with density estimation for absolutely regular or $\beta$-mixing processes (for a definition of mixing coefficients, see [Dou94]). For instance, under the assumption

$\beta_r = o(r^{-3-2d/\rho})$, Ango Nze & Doukhan prove in [AD98] that (2), (3) and (4) still hold. The sharper condition $\sum_r |\beta_r| < \infty$ entails the optimal rate of convergence for the MISE (see [Vie97]). Results for the MISE have been
extended to the more general $\tilde\phi$- and $\eta$-dependence contexts by Dedecker & Prieur ([DP04]) and Doukhan & Louhichi in [DL01]. In this paper, our aim is to extend the bounds (2), (3) and (4) in the $\eta$- and $\tilde\phi$-weak dependence contexts.

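We use the same method as in [DL99], based on the following moment inequality for weakly dependent and centered sequences $(Z_n)_{n\in\mathbb{Z}}$. For each even integer $q$ and for each integer $n \ge 2$:

$$\Big\|\sum_{i=1}^n Z_i\Big\|_q^q \le \frac{(2q-2)!}{(q-1)!}\Big\{V_{2,n}^{q/2} \vee V_{q,n}\Big\}, \qquad (5)$$

where $\|X\|_q^q = \mathbb{E}|X|^q$ and, for $k = 2, \dots, q$,

$$V_{k,n} = n\sum_{r=0}^{n-1}(r+1)^{k-2}C_k(r),$$

with

$$C_k(r) := \sup\big\{|\mathrm{cov}(Z_{t_1}\cdots Z_{t_p}, Z_{t_{p+1}}\cdots Z_{t_k})|\big\}, \qquad (6)$$

where the supremum is over all the ordered $k$-tuples $t_1 \le \cdots \le t_k$ such that $\sup_{1\le i\le k-1}(t_{i+1} - t_i) = r$, the largest gap being the one between $t_p$ and $t_{p+1}$.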
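To see how the quantities entering (5) are assembled, the following Python sketch evaluates $V_{k,n}$ and the bracketed term $V_{2,n}^{q/2} \vee V_{q,n}$ from a user-supplied covariance bound. The function names and the toy choice $C_k(r) = e^{-r}$ are assumptions made purely for illustration; the multiplicative constant of (5) is deliberately left out.

```python
import math

def v_kn(k, n, c_k):
    """V_{k,n} = n * sum_{r=0}^{n-1} (r+1)^(k-2) * C_k(r)."""
    return n * sum((r + 1) ** (k - 2) * c_k(r) for r in range(n))

def moment_bound_core(q, n, c_2, c_q):
    """Bracketed term of the moment inequality: max(V_{2,n}^(q/2), V_{q,n})."""
    return max(v_kn(2, n, c_2) ** (q / 2), v_kn(q, n, c_q))

def toy_covariance_bound(r):
    """Hypothetical covariance bound with geometric decay in the gap r."""
    return math.exp(-r)

n = 1000
core = moment_bound_core(4, n, toy_covariance_bound, toy_covariance_bound)
```

In this toy setting the covariance bounds are summable, so $V_{2,n}$ grows linearly in $n$ and the term $V_{2,n}^{q/2}$ dominates, giving a bound of order $n^{q/2}$.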

We will apply this bound when the $Z_i$s are defined in such a way that $\sum_{i=1}^n Z_i$ is proportional to the fluctuation term $\hat f_n(x) - \mathbb{E}\hat f_n(x)$. The inequality (5) gives a bound for this part of the deviation of the estimator which depends on the covariance bounds $C_k(r)$. The other part of the deviation is the bias, which is treated by deterministic methods. In order to obtain suitable controls of the fluctuation term, we need two different types of bounds for $C_k(r)$. Conditions on the decay of the weak dependence coefficients give a first bound. Another type of condition is also required to bound $C_k(r)$ for the smaller values of $r$; this is classically achieved with a regularity condition on the joint law of the pairs $(X_j, X_k)$ for all $j \ne k$. In Doukhan & Louhichi (see [DL01]), rates of convergence are obtained when the coefficient $\eta$ decays geometrically fast and the joint densities are bounded. We relax these conditions to cover the case when the joint distributions are not absolutely continuous and when the $\eta$- and $\tilde\phi$-dependence coefficients decrease slowly (sub-geometric and Riemannian decays are considered).

Under our assumptions, we prove that (2) still holds (see Theorem 1). Unfortunately, additional losses appear for the uniform bounds. When $\eta_r$ or $\tilde\phi(r) = O(e^{-ar^b})$ with $a > 0$ and $b > 0$, we prove in Theorem 2 that (3) and (4) hold with $\log(n)$ replaced by $\log^{2(b+1)/b}(n)$. If $\eta_r$ or $\tilde\phi(r) = O(r^{-a})$ with $a > 1$, Theorem 3 gives bounds similar to (3) and (4) with the right hand side replaced by
0(„-9P/{rf+2p+2d/(9o+d)} g^^^ o({iog*+d(„)/„go-2}p/{2p«o+d(go+2)}j^ respec- 
tively, and with go = 2 [(a — l)/2] (by definition [.t] is the smallest integer 
larger than or equal to the real number $x$). As already noticed in [DL01], the loss w.r.t. the i.i.d. case highly depends on the decay of the dependence coefficients. In the case of geometric decay, the loss is logarithmic, while it is polynomial in the case of polynomial decays.

The paper is organized as follows. In Section 2.1, we introduce the notions of $\eta$- and $\tilde\phi$-dependence. We give the notation and hypotheses in Section 2.2. The main results are presented in Section 2.3. We then apply these results to particular cases of weakly dependent processes, and we provide examples of kernels $K_m$ in Section 3. Section 4 contains the proofs of the theorems and three important lemmas.



2 Main results 

We first describe the notions of dependence considered in this paper, then we 
introduce assumptions and formulate the main results of the paper (conver- 
gence rates). 

2.1 Weak dependence 

We consider a sequence $(X_i)_{i\in\mathbb{Z}}$ of $\mathbb{R}^d$-valued random variables, and we fix a norm $\|\cdot\|$ on $\mathbb{R}^d$. Moreover, if $h : \mathbb{R}^{du} \to \mathbb{R}$ for some $u \ge 1$, we define

$$\mathrm{Lip}(h) = \sup_{(a_1,\dots,a_u)\ne(b_1,\dots,b_u)} \frac{|h(a_1,\dots,a_u) - h(b_1,\dots,b_u)|}{\|a_1 - b_1\| + \cdots + \|a_u - b_u\|}.$$

Definition 1 ($\eta$-dependence, Doukhan & Louhichi (1999)). The process $(X_i)_{i\in\mathbb{Z}}$ is $\eta$-weakly dependent if there exists a sequence of non-negative real numbers $(\eta_r)_{r\ge 0}$ satisfying $\eta_r \to 0$ when $r \to \infty$ and

$$\big|\mathrm{cov}\big(h(X_{i_1},\dots,X_{i_u}),\, k(X_{i_{u+1}},\dots,X_{i_{u+v}})\big)\big| \le \big(u\,\mathrm{Lip}(h) + v\,\mathrm{Lip}(k)\big)\,\eta_r,$$

for all $(u+v)$-tuples $(i_1,\dots,i_{u+v})$ with $i_1 \le \cdots \le i_u \le i_u + r \le i_{u+1} \le \cdots \le i_{u+v}$, and $h, k \in \Lambda^{(1)}$, where

$$\Lambda^{(1)} = \Big\{h : \exists u \ge 0,\ h : \mathbb{R}^{du} \to \mathbb{R},\ \mathrm{Lip}(h) < \infty,\ \|h\|_\infty = \sup_{x\in\mathbb{R}^{du}}|h(x)| \le 1\Big\}.$$

Remark. The $\eta$-dependence condition can be applied to non-causal sequences because information "from the future" (i.e. on the right of the covariance) contributes to the dependence coefficient in the same way as information "from the past" (i.e. on the left). It is the non-causal alternative to the $\theta$ condition in [DD03] and [DL99].

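Definition 2 ($\tilde\phi$-dependence, Dedecker & Prieur (2004)). Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $\mathcal{M}$ a $\sigma$-algebra of $\mathcal{A}$. For any $l \in \mathbb{N}^*$, any random variable $X \in \mathbb{R}^{dl}$, we define:

$$\tilde\phi(\mathcal{M}, X) = \sup\big\{\|\mathbb{E}(g(X)\mid\mathcal{M}) - \mathbb{E}(g(X))\|_\infty,\ g \in \Lambda_{1,l}\big\},$$

where $\Lambda_{1,l} = \{h : \mathbb{R}^{dl} \to \mathbb{R} \,/\, \mathrm{Lip}(h) \le 1\}$. The sequence of coefficients $\tilde\phi_k(r)$ is then defined by

$$\tilde\phi_k(r) = \max_{l\le k}\frac{1}{l}\sup_{i+r\le j_1<\cdots<j_l}\tilde\phi\big(\sigma(\{X_j;\ j\le i\}),\ (X_{j_1},\dots,X_{j_l})\big).$$

The process is $\tilde\phi$-dependent if $\tilde\phi(r) = \sup_{k>0}\tilde\phi_k(r)$ tends to 0 with $r$.

Remark. The $\tilde\phi$-dependence coefficients provide covariance bounds. For a Lipschitz function $k$ and a bounded function $h$,

$$\big|\mathrm{cov}\big(h(X_{i_1},\dots,X_{i_u}),\, k(X_{i_{u+1}},\dots,X_{i_{u+v}})\big)\big| \le v\,\mathbb{E}|h(X_{i_1},\dots,X_{i_u})|\,\mathrm{Lip}(k)\,\tilde\phi(r). \qquad (7)$$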
2.2 Notations and definitions 

Assume that $(X_n)_{n\in\mathbb{Z}}$ is an $\eta$- or $\tilde\phi$-dependent sequence of $\mathbb{R}^d$-valued random variables. We consider two types of decays for the coefficients. The geometric case is the case when Assumption [H1] or [H1'] holds.

[H1]: $\eta_r = O(e^{-ar^b})$ with $a > 0$ and $b > 0$,

[H1']: $\tilde\phi(r) = O(e^{-ar^b})$ with $a > 0$ and $b > 0$.

The Riemannian case is the case when Assumption [H2] or [H2'] holds.

[H2]: $\eta_r = O(r^{-a})$ with $a > 1$,
[H2']: $\tilde\phi(r) = O(r^{-a})$ with $a > 1$.

As usual in density estimation, we shall assume:

[H3]: The common marginal distribution of the random variables $X_n$, $n \in \mathbb{Z}$, is absolutely continuous with respect to Lebesgue's measure, with common bounded density $f$.

The next assumption is on the density with respect to Lebesgue's measure (if it exists) of the joint distribution of the pairs $(X_j, X_k)$, $j \ne k$.

[H4]: The density $f_{j,k}$ of the joint distribution of the pair $(X_j, X_k)$ is uniformly bounded with respect to $j \ne k$.

Unfortunately, for some processes, these densities may not even exist. For example, the joint distributions of Markov chains $X_n = G(X_{n-1}, \xi_n)$ may not be absolutely continuous. One of the simplest examples is

$$X_k = \frac{1}{2}\,(X_{k-1} + \xi_k), \qquad (8)$$

where $\{\xi_k\}$ is an i.i.d. sequence of Bernoulli random variables and $X_0$ is uniformly distributed on $[0,1]$. The process $\{X_n\}$ is strictly stationary but the joint distributions of the pairs $(X_0, X_k)$ are degenerate for any $k$. This Markov chain can also be represented (through an inversion of the time) as a dynamical system $(T_{-n}, \dots, T_{-1}, T_0)$ which has the same law as $(X_0, X_1, \dots, X_n)$ ($T_0$ and $X_0$ are random variables distributed according to the invariant measure, see [BGR00] for more details). Let us recall the definition of a dynamical system.

Definition 3 (dynamical system). A one-dimensional dynamical system is defined by

$$\forall k \in \mathbb{N},\quad T_k := F^k(T_0), \qquad (9)$$

where $F : I \to I$, $I$ is a compact subset of $\mathbb{R}$ and, in this context, $F^k$ denotes the $k$-th iterate of the application $F$: $F^1 = F$, $F^{k+1} = F \circ F^k$, $k \ge 1$. We assume that there exists an invariant probability measure $\mu_0$, i.e. $F(\mu_0) = \mu_0$, absolutely continuous with respect to Lebesgue's measure, and that $T_0$ is a random variable with distribution $\mu_0$.

We restrict our study to one-dimensional dynamical systems $T$ in the class $\mathcal{F}$ of dynamical systems defined by a transformation $F$ that satisfies the following assumptions (see [Pri01]).

• $\forall k \in \mathbb{N}$, $\forall x \in \mathrm{int}(I)$, $\lim_{t\to 0^+} F^k(x+t) = F^k(x^+)$ and $\lim_{t\to 0^-} F^k(x+t) = F^k(x^-)$ exist;

• $\forall k \in \mathbb{N}^*$, denoting $D_+^k = \{x \in \mathrm{int}(I),\ F^k(x^+) = x\}$ and $D_-^k = \{x \in \mathrm{int}(I),\ F^k(x^-) = x\}$, we assume $\lambda\big(\bigcup_{k\in\mathbb{N}^*}(D_+^k \cup D_-^k)\big) = 0$, where $\lambda$ is the Lebesgue measure.

When the joint distributions of the pairs $(X_j, X_k)$ are not assumed absolutely continuous (and then [H4] is not satisfied), we shall instead assume:

[H5]: The dynamical system $(X_n)_{n\in\mathbb{Z}}$ belongs to $\mathcal{F}$.

We consider in this paper linear estimators as in (1). The sequence of kernels $K_m$ is assumed to satisfy the following assumptions.

(a) The support of $K_m$ is a compact set with diameter $O(1/m^{1/d})$;

(b) The functions $x \mapsto K_m(x,y)$ and $x \mapsto K_m(y,x)$ are Lipschitz functions with Lipschitz constant $O(m^{1+1/d})$;

(c) For all $x$ in the support of $K_m$, $\int K_m(x,y)\,dy = 1$;

(d) The bias of the estimator $\hat f_n$ defined in (1) is of order $m_n^{-\rho/d}$, uniformly on compact sets: for all $M > 0$,

$$\sup_{\|x\|\le M}\big|\mathbb{E}[\hat f_n(x)] - f(x)\big| = O\big(m_n^{-\rho/d}\big). \qquad (10)$$



2.3 Results

In all our results we consider kernels $K_m$ and a density estimator of the form (1) such that assumptions (a), (b), (c) and (d) hold.

Theorem 1 ($L^q$-convergence).

Geometric case. Under Assumptions [H4] or [H5] and [H1] or [H1'], the sequence $m_n$ can be chosen such that inequality (2) holds for all $0 < q < +\infty$.

Riemannian case. Under the assumptions [H4] or [H5], if additionally

• [H2] holds with $a > \max(1 + 2/d + (d+1)/\rho,\ 2 + 1/d)$ ($\eta$-dependence),

• or [H2'] holds with $a > 1 + 2/d + 1/\rho$ ($\tilde\phi$-dependence),

then the sequence $m_n$ can be chosen such that inequality (2) holds for all $0 < q \le q_0 = 2\lceil (a-1)/2\rceil$.

Theorem 2 (Uniform rates, geometric decays). For any $M > 0$, under Assumptions [H4] or [H5] and [H1] or [H1'] we have, for all $0 < q < +\infty$, and for a suitable choice of the sequence $m_n$,

$$\mathbb{E}\sup_{\|x\|\le M}|\hat f_n(x) - f(x)|^q = O\Big(\Big(\frac{\log^{2(b+1)/b}(n)}{n}\Big)^{q\rho/(d+2\rho)}\Big),$$

$$\sup_{\|x\|\le M}|\hat f_n(x) - f(x)| =_{a.s.} O\Big(\Big(\frac{\log^{2(b+1)/b}(n)}{n}\Big)^{\rho/(d+2\rho)}\Big).$$

Theorem 3 (Uniform rates, Riemannian decays). For any $M > 0$, under Assumptions [H4] or [H5], [H2] or [H2'] with $a > 4$ and $\rho > 2d$, for $q_0 = 2\lceil (a-1)/2\rceil$ and $q \le q_0$, the sequence $m_n$ can be chosen such that

E sup |/„(x) - = O (n'^^+^^^+wi^^S+ay) , 
lkll<M ^ ^ 

or such that 

(( loe-^o +'^Cr)~) \ <'(<'o+2)+P(<io+<*A 
V / / 

Remarks.

• Theorem 1 shows that the optimal convergence rate of (2) still holds in the weak dependence context. In the Riemannian case, when $a > 4$, the conditions are satisfied if the density function $f$ is sufficiently regular, namely, if $\rho > d + 1$.

• The loss with respect to the i.i.d. case in the uniform convergence rates 
(Theorems 2 and 3) is due to the fact that the probability inequalities 
for dependent observations are not as good as Bernstein's inequality for 
i.i.d. random variables (Bernstein inequalities in weak dependence context 
are proved in [KN05]). The convergence rates depend on the decay of the 
weak dependence coefficients. This is in contrast to the case of independent 
observations. 

• In Theorem 2 the loss is a power of the logarithm of the number of observations. Let us remark that this loss is reduced when $b$ tends to infinity. In the case of $\eta$-dependence and geometric decay, the same result is in [DL99] for the special case $b = 1$. In the framework of $\tilde\phi$-dependence, Theorem 2 seems to provide the first result on uniform rates of convergence for density estimators.

• In Theorem 3, the rate of convergence in the mean is better than the almost sure rate for technical reasons. Contrary to the geometric case, the loss is no longer logarithmic but is a power of $n$. The rate gets closer to the optimal rate as $q_0 \to \infty$, or equivalently $a \to \infty$.

• These results are new under the assumption of Riemannian decay of the weak dependence coefficients. The condition on $a$ is similar to the condition on $\beta$ in [AD03]. Even if the rates are better than in [DL01], there is a huge loss with respect to the mixing case. It would be interesting to know the minimax rates of convergence in this framework.

3 Models, applications and extensions 

The class of weakly dependent processes is very large. We apply our results to three examples: two-sided moving averages, bilinear models and expanding maps. The first two will be handled with the help of the coefficients $\eta$, the third one with the coefficients $\tilde\phi$.

3.1 Examples of $\eta$-dependent time series.

It is of course possible to define $\eta$-dependent random fields (see [DDLLLP04] for further details); for simplicity, we only consider processes indexed by $\mathbb{Z}$.

Definition 4 (Bernoulli shifts). Let $H : \mathbb{R}^{\mathbb{Z}} \to \mathbb{R}$ be a measurable function. A Bernoulli shift is defined as $X_n = H(\xi_{n-i},\ i \in \mathbb{Z})$, where $(\xi_i)_{i\in\mathbb{Z}}$ is a sequence of i.i.d. random variables called the innovation process.

In order to obtain a bound for the coefficients $\{\eta_r\}$, we introduce the following regularity condition on $H$. There exists a sequence $\{\delta_r\}$ such that

$$\sup_{i\in\mathbb{Z}}\mathbb{E}\big|H(\xi_{i-j},\ j\in\mathbb{Z}) - H(\xi_{i-j}\mathbf{1}_{|j|<r},\ j\in\mathbb{Z})\big| \le \delta_r.$$

Bernoulli shifts are $\eta$-dependent with $\eta_r = 2\delta_{r/2}$ (see [DL99]). In the following, we consider two special cases of Bernoulli shifts.

1. Non-causal linear processes. A real-valued sequence $(a_i)_{i\in\mathbb{Z}}$ such that $\sum_{j\in\mathbb{Z}} a_j^2 < \infty$ and the innovation process define a non-causal linear process $X_n = \sum_{i\in\mathbb{Z}} a_i\xi_{n-i}$. If we control a moment of the innovations, the linear process $(X_n)$ is $\eta$-dependent. The sequence $\{\eta_r\}_{r\in\mathbb{N}}$ is directly linked to the coefficients $\{a_i\}_{i\in\mathbb{Z}}$ and various types of decay may occur. We consider only Riemannian decays $a_i = O(|i|^{-A})$ with $A > 5$, since results for geometric decays are already known. Here $\eta_r = O\big(\sum_{|i|\ge r/2}|a_i|\big) = O(r^{1-A})$ and [H2] holds. Furthermore, we assume that the sequence $(\xi_i)_{i\in\mathbb{Z}}$ is i.i.d. and satisfies the condition $|\mathbb{E}e^{iu\xi_0}| \le C(1+|u|)^{-\delta}$, for all $u \in \mathbb{R}$ and for some $\delta > 0$ and $C < \infty$. Then, the densities $f$ and $f_{j,k}$ exist for all $j \ne k$ and they are uniformly bounded (see the proof in the causal case in Lemma 1 and Lemma 2 in [GKS96]); hence [H4] holds. If the density $f$ of $X_0$ is $\rho$-regular with $\rho > 2$, our estimators converge to the density with the rates:

• $n^{-\rho/(2\rho+1)}$ in $L^q$-norm ($q \le 4$) at each point $x$,

• $n^{-\rho/(2\rho+3/2)}$ in $L^q$-norm ($q \le 4$) uniformly on an interval,

• (log** (n)/n) almost surely on an interval. 

In the first case, the rate we obtain is the same as in the i.i.d. case. For such linear models, the density estimator also satisfies the Central Limit Theorem (see [HLT01] and [Ded98]).

2. Bilinear model. The process $\{X_t\}$ is a bilinear model if there exist two sequences $(a_i)_{i\in\mathbb{N}^*}$ and $(b_i)_{i\in\mathbb{N}^*}$ of real numbers and real numbers $a$ and $b$ such that:

$$X_t = \xi_t\Big(a + \sum_{j=1}^{\infty}a_jX_{t-j}\Big) + b + \sum_{j=1}^{\infty}b_jX_{t-j}. \qquad (11)$$

Squared ARCH($\infty$) or GARCH($p$, $q$) processes satisfy such an equation, with $b = b_j = 0$ for all $j \ge 1$. Define

$$\lambda = \|\xi_0\|_p\sum_{j=1}^{\infty}|a_j| + \sum_{j=1}^{\infty}|b_j|.$$

If $\lambda < 1$, then the equation (11) has a strictly stationary solution in $L^p$ (see [DMR05]). This solution is a Bernoulli shift for which we have the behavior of the coefficient $\eta$:

• $\eta_r = O(e^{-\lambda r})$ for some $\lambda > 0$ if there exists an integer $N$ such that $a_i = b_i = 0$ for $i \ge N$.

• $\eta_r = O(e^{-\lambda\sqrt{r}})$ for some $\lambda > 0$ if $a_i = O(e^{-Ai})$ and $b_i = O(e^{-Bi})$ with $A > 0$ and $B > 0$.

• $\eta_r = O(\{r/\log(r)\}^{-\lambda})$ for some $\lambda > 0$ if $a_i = O(i^{-A})$ and $b_i = O(i^{-B})$ with $A > 1$ and $B > 1$.

Let us assume that the i.i.d. sequence $\{\xi_t\}$ has a marginal density $f_\xi \in C_\rho$, for some $\rho > 2$. The density of $X_t$ conditionally to the past can be written as a function of $f_\xi$. We then check recursively that the common density of $X_t$ for all $t$, say $f$, also belongs to $C_\rho$. Furthermore, the regularity of $f_\xi$ ensures that $f$ and the joint densities $f_{j,k}$ for all $j \ne k$ are bounded (see [DMR05]) and [H4] holds. The assumptions of Theorem 1 are satisfied, and the estimator $\hat f_n$ achieves the minimax bound (2) if either:

• There exists an integer $N$ such that $a_i = b_i = 0$ for $i \ge N$;

• There exist $A > 0$ and $B > 0$ such that $a_i = O(e^{-Ai})$ and $b_i = O(e^{-Bi})$;

• There exist $A > 4$ and $B > 5$ such that $a_i = O(i^{-A})$ and $b_i = O(i^{-B})$. Then, this optimal bound holds only for $2 \le q \le q(A,B)$ where $q(A,B) = 2\lceil ((B-1)\wedge A)/2\rceil$.
Note finally that the rates of uniform convergence provided by Theorems 2 
and 3 are sub-optimal. 
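Returning to the non-causal linear processes of item 1, the following Python sketch simulates a truncated two-sided moving average and evaluates the coefficient tail that controls $\eta_r$. The coefficient choice $a_i = (1+|i|)^{-A}$, the Gaussian innovations, the truncation levels and all function names are assumptions made for illustration only.

```python
import random

def coefficient(i, A):
    """Hypothetical Riemannian coefficients a_i = (1 + |i|)^(-A)."""
    return (1 + abs(i)) ** (-A)

def simulate_two_sided_ma(n, A=6.0, trunc=50, seed=0):
    """Simulate X_t = sum_{|i| <= trunc} a_i * xi_{t-i} with Gaussian innovations."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * trunc)]
    # xi[t + trunc - i] plays the role of xi_{t-i}; indices i < 0 bring in
    # "future" innovations, which is what makes the process non-causal.
    return [
        sum(coefficient(i, A) * xi[t + trunc - i] for i in range(-trunc, trunc + 1))
        for t in range(n)
    ]

def coefficient_tail(r, A, trunc=10_000):
    """Truncated sum of |a_i| over |i| >= r/2 (r a positive integer)."""
    start = max(-(-r // 2), 1)  # ceil(r / 2), at least 1
    return 2 * sum(coefficient(i, A) for i in range(start, trunc + 1))

path = simulate_two_sided_ma(100)
```

Doubling $r$ divides the tail by roughly $2^{A-1}$, in line with the order $r^{1-A}$ stated for $\eta_r$.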

3.2 Examples of $\tilde\phi$-dependent time series.

Let us introduce an important class of dynamical systems:

Example 1. $(T_i = F^i(T_0))_{i\in\mathbb{N}}$ is an expanding map, or equivalently $F$ is a Lasota-Yorke function, if it satisfies the three following criteria.

• (Regularity) There exists a grid $0 = a_0 < a_1 < \cdots < a_n = 1$ such that $F \in C^1$ and $|F'(x)| > 0$ on $]a_{i-1}, a_i[$ for each $i = 1, \dots, n$.

• (Expansivity) Let $I_n$ be the set on which $(F^n)'$ is defined. There exist $A > 0$ and $s > 1$ such that $\inf_{x\in I_n}|(F^n)'(x)| > As^n$.

• (Topological mixing) For any nonempty open sets $U$, $V$, there exists $n_0 \ge 1$ such that $F^{-n}(U) \cap V \ne \emptyset$ for all $n \ge n_0$.

Examples of Markov chains $X_n = G(X_{n-1}, \xi_n)$ associated to an expanding map $\{T_n\}$ belonging to $\mathcal{F}$ are given in [BGR00] and [DP04]. The simplest one is $X_k = (X_{k-1} + \xi_k)/2$, where the $\xi_k$ follow a binomial law and $X_0$ is uniformly distributed on $[0,1]$. We easily check that $F(x) = 2x \bmod 1$, the transformation of the associated dynamical system $T_n$, satisfies all the assumptions, so that $T_n$ is an expanding map belonging to $\mathcal{F}$.

The coefficients of $\tilde\phi$-dependence of such a Markov chain satisfy $\tilde\phi(r) = O(e^{-ar})$ for some $a > 0$ (see [DP04]). Theorems 1 and 2 give the $L^q$ rate $n^{-\rho/(2\rho+1)}$, the uniform rate and the almost sure rate $(\log^4(n)/n)^{\rho/(2\rho+1)}$ of the estimators of the density of $\mu_0$.
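The link between the Markov chain $X_k = (X_{k-1} + \xi_k)/2$ and the map $F(x) = 2x \bmod 1$ can be checked directly: applying $F$ to $X_k$ recovers $X_{k-1}$, which is the time inversion mentioned above. The Python sketch below does this with exact rational arithmetic; the function names and the dyadic approximation of the uniform starting point are assumptions made for illustration.

```python
from fractions import Fraction
import random

def doubling_map(x):
    """F(x) = 2x mod 1 on [0, 1)."""
    return (2 * x) % 1

def simulate_chain(n, seed=0):
    """Simulate X_k = (X_{k-1} + xi_k) / 2 with Bernoulli(1/2) innovations."""
    rng = random.Random(seed)
    # Dyadic rational in (0, 1) standing in for a uniform starting point.
    x = Fraction(rng.randrange(1, 2**30), 2**30)
    path = [x]
    for _ in range(n):
        xi = rng.randint(0, 1)
        x = (x + xi) / 2
        path.append(x)
    return path

path = simulate_chain(20)
# Reading the chain backwards gives an orbit of the doubling map.
is_reversed_orbit = all(
    doubling_map(path[k]) == path[k - 1] for k in range(1, len(path))
)
```

Since $X_{k-1}$ is a deterministic function of $X_k$, the pair $(X_{k-1}, X_k)$ lives on a one-dimensional set, which illustrates why the joint distributions cannot have a density.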



3.3 Sampled process 

Since we do not assume stationarity of the observed process, the following observation scheme is covered by our results. Let $(x_n)_{n\in\mathbb{Z}}$ be a stationary process whose marginal distribution is absolutely continuous, let $(h_n)_{n\in\mathbb{Z}}$ be a sequence of monotone functions and consider the sampled process $\{X_{i,n}\}_{1\le i\le n}$ defined by $X_{i,n} = x_{h_n(i)}$. The dependence coefficients of the sampled process may decay to zero faster than those of the underlying unobserved process. For instance, if the dependence coefficients of the process $(x_n)_{n\in\mathbb{Z}}$ have a Riemannian decay, those of the sampled process $\{x_{h_n(i)}\}$ with $h_n(i) = i2^n$ decay geometrically fast. The observation scheme is thus a crucial factor that determines the rate of convergence of density estimators.



3.4 Density estimators and bias 

In this section, we provide examples of kernels Km and smoothness assump- 
tions on the density / such that assumptions (a), (b), (c) and (d) of subsec- 
tion 2.2 are satisfied. 

Kernel estimators. The kernel estimator associated to the bandwidth parameter $m_n$ is defined by:

$$\hat f_n(x) = \frac{m_n}{n}\sum_{i=1}^{n}K\big(m_n^{1/d}(x - X_i)\big).$$

We briefly recall the classical analysis for the deterministic part $R_n$ in this case (see [Tsy04]). Since the sequence $\{X_n\}$ has a constant marginal distribution, we have $\mathbb{E}[\hat f_n(x)] = f_n(x)$ with $f_n(x) = \int_D K(s)\,f(x - s/m_n^{1/d})\,ds$. Let us assume that $K$ is a Lipschitz function compactly supported in $D \subset \mathbb{R}^d$. For $\rho > 0$, let $K$ satisfy, for all $j = j_1 + \cdots + j_d$ with $(j_1, \dots, j_d) \in \mathbb{N}^d$:

$$\int x_1^{j_1}\cdots x_d^{j_d}\,K(x_1,\dots,x_d)\,dx_1\cdots dx_d \ \begin{cases} = 1 & \text{if } j = 0,\\ = 0 & \text{for } j \in \{1, \dots, \lceil\rho-1\rceil - 1\},\\ \ne 0 & \text{if } j = \lceil\rho-1\rceil.\end{cases}$$

Then the kernels $K_m(x,y) = mK(m^{1/d}(x-y))$ satisfy (a), (b) and (c). Assumption (d) holds if $f \in C_\rho$, where $C_\rho$ is the class of functions $f$ such that, for $\rho = \lceil\rho-1\rceil + c$ with $0 < c \le 1$, $f$ is $\lceil\rho-1\rceil$-times continuously differentiable and there exists $A > 0$ such that, $\forall (x,y) \in \mathbb{R}^d\times\mathbb{R}^d$, $|f^{(\lceil\rho-1\rceil)}(x) - f^{(\lceil\rho-1\rceil)}(y)| \le A|x-y|^c$.
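As a concrete instance of the kernel estimator in dimension $d = 1$, the Python sketch below uses $K_m(x,y) = mK(m(x-y))$ with a triangular kernel, which is Lipschitz, compactly supported and integrates to one. The choice of kernel, the function names and the sample are assumptions made for illustration, not the authors' implementation.

```python
import random

def triangular_kernel(u):
    """Lipschitz kernel supported on [-1, 1] with unit integral."""
    return max(0.0, 1.0 - abs(u))

def kernel_density(x, sample, m):
    """Estimate (m / n) * sum_i K(m * (x - X_i)) for d = 1."""
    return m / len(sample) * sum(triangular_kernel(m * (x - xi)) for xi in sample)

# Hypothetical usage: 200 uniform observations and bandwidth parameter m = 10.
rng = random.Random(1)
sample = [rng.random() for _ in range(200)]
estimate_at_half = kernel_density(0.5, sample, 10)
```

Each observation contributes a bump of width $2/m$ and height $m/n$, so that a larger $m$ reduces the bias while increasing the fluctuation.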

Projection estimators. We only consider in this section the case $d = 1$. Under the assumption that the family $\{1, x, x^2, \dots\}$ belongs to $L^2(I, \mu)$, where $I$ is a bounded interval of $\mathbb{R}$ and $\mu$ is a measure on $I$, an orthonormal basis of $L^2(I, \mu)$ can be defined which consists of polynomials $\{P_0, P_1, P_2, \dots\}$. We assume that $f$ belongs to a class $C'_\rho$ which is slightly more restrictive than the class $C_\rho$ (see Theorem 6.23 p. 218 in [DS01] for details). Then for any $f \in L^2(I, \mu)\cap C'_\rho$, there exists a function $\pi_{f,m_n} \in V_{m_n}$ such that $\sup_{x\in I}|f(x) - \pi_{f,m_n}(x)| = O(m_n^{-\rho})$. Consider then the projection $\pi_{m_n}f$ of $f$ on the subspace $V_{m_n} = \mathrm{Vect}\{P_0, \dots, P_{m_n}\}$. It can be expressed as

$$\pi_{m_n}f(x) = \sum_{j=0}^{m_n}\Big(\int_I P_j(s)f(s)\,d\mu(s)\Big)P_j(x).$$

The projection estimator of the density $f$ of the real valued random variables $\{X_i\}_{1\le i\le n}$ is naturally defined as

$$\hat f_n(x) = \frac{1}{n}\sum_{i=1}^{n}K_{m_n}(x, X_i) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=0}^{m_n}P_j(X_i)P_j(x).$$

Then $\mathbb{E}\hat f_n(x) = \pi_{m_n}f(x)$ is an approximation of $f(x)$ in $V_{m_n}$. The fact that $I$ is compact and the Christoffel-Darboux formula and its corollary (see [Sze33]) ensure properties (a) and (b) for the kernels $K_m$. We easily check that property (c) also holds. Unfortunately, the optimal rate $(m_n^{-\rho})$ does not necessarily hold. We then have to consider the weighted kernels $K^a_m(x,y)$ defined by:

$$K^a_m(x,y) = \sum_{j=0}^{m}a_{m,j}\sum_{k=0}^{j}P_k(x)P_k(y),$$

where $\{a_{m,j};\ m \in \mathbb{N},\ 0 \le j \le m\}$ is a weight sequence satisfying $\sum_{j=0}^{m}a_{m,j} = 1$ and, for all $j$, $\lim_{m\to\infty}a_{m,j} = 0$. If the sequence $\{a_{m,j}\}$ is such that $K^a_m$ is a nonnegative kernel, then $\|K^a_m\|_1 = \int_I K^a_m(x,s)\,d\mu(s) = 1$ and the kernel $K^a_m$ satisfies (a), (b) and (c). Moreover, the uniform norm of the operator $f \mapsto K^a_m * f(x)$ is $\sup_{\|f\|_\infty=1}\|K^a_m * f\|_\infty = \|K^a_m\|_1 = 1$. The linear estimator built with this kernel is

$$\hat f^a_n(x) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=0}^{m_n}a_{m_n,j}\sum_{k=0}^{j}P_k(X_i)P_k(x),$$

and its bias has the optimal rate:

$$|\mathbb{E}\hat f^a_n(x) - f(x)| \le \big|K^a_{m_n} * \big(f(x) - \pi_{f,m_n}(x)\big)\big| + |\pi_{f,m_n}(x) - f(x)|,$$

$$\le (\|K^a_{m_n}\|_1 + 1)\,m_n^{-\rho} = O(m_n^{-\rho}).$$

Such an array $\{a_{m,j}\}$ cannot always be defined. We give an example where it is possible.

Example 2 (Fejér kernel). For the trigonometric basis $\{\cos(nx), \sin(nx)\}_{n\in\mathbb{N}}$, we can find a $2\pi$-periodic function $f \in C'_1$ such that $\sup_{x\in[-\pi,\pi]}|f(x) - \pi_m f(x)| = O(m^{-1}\log m)$. The associated estimator reads:

$$\hat f_n(x) = \frac{1}{2\pi} + \frac{1}{n\pi}\sum_{i=1}^{n}\sum_{k=1}^{m_n}\big(\cos(kX_i)\cos(kx) + \sin(kX_i)\sin(kx)\big).$$

We remark that $\mathbb{E}\hat f_n$ is the Fourier series of $f$ truncated at order $m_n$:

$$D_{m_n} * f(x) = \frac{1}{2\pi}\int_0^{2\pi}f(t)\,D_{m_n}(x - t)\,dt,$$

where

$$D_m(x) = \sum_{k=-m}^{m}e^{ikx} = \frac{\sin\{(2m+1)x/2\}}{\sin(x/2)}$$

is (the symmetric) Dirichlet's kernel. Recall that Fejér's kernel is defined as

$$F_m(x) = \frac{1}{m}\sum_{k=0}^{m-1}D_k(x) = \sum_{k=-(m-1)}^{m-1}\Big(1 - \frac{|k|}{m}\Big)e^{ikx} = \frac{\sin^2(mx/2)}{m\sin^2(x/2)}.$$

The kernel $F_m$ is a nonnegative weighted kernel corresponding to Dirichlet's kernel and the sequence of weights $a_{m,j} = 1/m$, and it satisfies (a), (b) and (c). The estimator associated to Fejér's kernels is defined by

$$\hat f_n(x) = \frac{1}{2\pi} + \frac{1}{n\pi}\sum_{i=1}^{n}\sum_{j=1}^{m_n}\frac{1}{m_n}\sum_{k=1}^{j}\big(\cos kX_i\cos kx + \sin kX_i\sin kx\big).$$

If the common density $f$ is $2\pi$-periodic and belongs to $C'_1$, then assumption (d) holds.

Using general Jackson's kernels (see [DS01]), we can find an estimator such that $R_n = O(m_n^{-\rho})$ for other values of $\rho$, but the weight sequence $a_{m,j}$ highly depends on the value of $\rho$.

Wavelet estimation. Wavelet estimation is a particular case of projection estimation. For the sake of simplicity, we restrict the study to $d = 1$.

Definition 5 (Scaling function [Dou88]). A function $\phi \in L^2(\mathbb{R})$ is called a scaling function if the family $\{\phi(\cdot - k);\ k \in \mathbb{Z}\}$ is orthonormal.

We choose the bandwidth parameter $m_n = 2^{j(n)}$ and define $V_j = \mathrm{Vect}\{\phi_{j,k},\ k \in \mathbb{Z}\}$, where $\phi_{j,k}(x) = 2^{j/2}\phi(2^jx - k)$. Under the assumption that $\phi$ is compactly supported, we define (the sum over the index $k$ is in fact finite):

$$\hat f_n(x) = \frac{1}{n}\sum_{k=-\infty}^{\infty}\sum_{i=1}^{n}\phi_{j(n),k}(X_i)\,\phi_{j(n),k}(x).$$

The wavelet estimator is of the form (1) with $K(x,y) = \sum_{k=-\infty}^{\infty}\phi(y-k)\phi(x-k)$ and $K_m(x,y) = mK(mx, my)$. Under the additional assumption that $\sum_{k\in\mathbb{Z}}\phi(x-k) = 1$ for almost all $x$, we can write:

$$|\mathbb{E}(\hat f_n(x)) - f(x)| = \Big|\int K_{m_n}(y,x)f(y)\,dy - f(x)\Big| = \Big|\int m_nK(m_ny, m_nx)\big(f(y) - f(x)\big)\,dy\Big| = \Big|\int K(m_nx + t, m_nx)\big(f(x + t/m_n) - f(x)\big)\,dt\Big|.$$

If $\phi$ is a Lipschitz function such that $\int\phi(x)x^j\,dx = 0$ if $0 < j < \lceil\rho-1\rceil$ and $\int\phi(x)x^{\lceil\rho-1\rceil}\,dx \ne 0$, then the kernel $K_m$ satisfies properties (a), (b) and (c). If $f \in C_\rho$, then Assumption (d) holds.



4 Proof of the Theorems 

The proof of our results is based on the decomposition:

$$\hat f_n(x) - f(x) = \underbrace{\hat f_n(x) - \mathbb{E}(\hat f_n(x))}_{FL_n(x)\,=\,\text{fluctuation}} + \underbrace{\mathbb{E}(\hat f_n(x)) - f(x)}_{\text{bias}}. \qquad (12)$$

The bias term is of order $m_n^{-\rho/d}$ by Assumption (d). We now present three lemmas useful to derive the rate of the fluctuation term.
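To make the trade-off in (12) concrete, the Python sketch below balances a fluctuation of order $(m_n/n)^{1/2}$, the order obtained in Lemma 1, against a bias of order $m_n^{-\rho/d}$. This is only a back-of-the-envelope illustration with constants ignored; the function names are assumptions.

```python
def balanced_bandwidth(n, rho, d):
    """m_n solving sqrt(m / n) = m^(-rho / d), i.e. m = n^(d / (2 rho + d))."""
    return n ** (d / (2 * rho + d))

def error_orders(n, rho, d):
    """Return (fluctuation order, bias order) at the balanced bandwidth."""
    m = balanced_bandwidth(n, rho, d)
    return (m / n) ** 0.5, m ** (-rho / d)

# Hypothetical values: n = 10^4 observations, regularity rho = 2, dimension d = 1.
fluctuation, bias = error_orders(10_000, 2.0, 1)
target_rate = 10_000 ** (-2.0 / (2 * 2.0 + 1))
```

Both terms then equal $n^{-\rho/(2\rho+d)}$, which is the rate appearing in (2).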

Lemma 1 (Moment inequalities). For each even integer $q$, under the assumption [H4] or [H5] and if moreover one of the following assumptions holds:

• [H1] or [H1'] holds (geometric case);

• [H2] holds, $m_n = n^\delta\log(n)^\gamma$ with $\delta > 0$, $\gamma \in \mathbb{R}$ and

$$a > \max\Big(q - 1,\ \frac{(q-1)\,\delta\,(4 + 2/d)}{q - 2 + \delta(4-q)},\ 2 + \frac{1}{d}\Big);$$

• [H2'] holds, $m_n = n^\delta\log(n)^\gamma$ with $\delta > 0$, $\gamma \in \mathbb{R}$ and

$$a > \max\Big(q - 1,\ \frac{(q-1)\,\delta\,(2 + 2/d)}{q - 2 + \delta(4-q)},\ 1 + \frac{1}{d}\Big).$$

Then, for each $x \in \mathbb{R}^d$,

$$\limsup_{n\to\infty}\,(n/m_n)^{q/2}\,\|FL_n(x)\|_q^q < +\infty.$$

Lemma 2 (Probability inequalities).

• Geometric case. Under Assumptions [H4] or [H5] and [H1] or [H1'], there exist positive constants $C_1$, $C_2$ such that

$$\mathbb{P}\big(|FL_n(x)| \ge \epsilon\sqrt{m_n/n}\big) \le C_1\exp\{-C_2\,\epsilon^{b/(b+1)}\}.$$

• Riemannian case. Under Assumptions [H4] or [H5], if $m_n = n^\delta\log(n)^\gamma$ and if one of the following assumptions holds:

- [H2] with $a > \max\{1 + 2(\delta + 1/d)/(1-\delta),\ 2 + 1/d\}$,

- [H2'] with $a > \max\{1 + 2/(d(1-\delta)),\ 1 + 1/d\}$,
then,

$$\mathbb{P}\big(|FL_n(x)| \ge \epsilon\sqrt{m_n/n}\big) \le C\epsilon^{-q_0},$$

with $q_0 = 2\lceil (a-1)/2\rceil$.

Lemma 3 (Fluctuation rates). Under the assumptions of Lemma 2, we have for any $M > 0$,

• Geometric case.

$$\sup_{\|x\|\le M}|FL_n(x)| =_{a.s.} O\Big(\sqrt{\frac{m_n}{n}}\,\log^{(b+1)/b}(n)\Big).$$

Riemannian case. 



1+2/go \ 1+37™ 



sup \FLr,ix)\ =a..s. O , log n 

lkll<M \ V ^/90 

with go = 2 \{a-l)/2\. 
Remarks. 

• In Lemma 1, we improve the moment inequality of [DL01], where the condition in the case of coefficient $\eta$ is $a > 3(q-1)$, which is always stronger than our condition.

• In the i.i.d. case a Bernstein type inequality is available:

$$\mathbb{P}\big(|FL_n(x)| \ge \epsilon\sqrt{m_n/n}\big) \le C_1\exp(-C_2\,\epsilon^2).$$

Lemma 2 provides a weaker inequality for dependent sequences. Other probability inequalities for dependent sequences are presented in [DP04] and [KN05].

• Lemma 3 gives the almost sure bounds for the fluctuation. It is derived directly from the two previous lemmas.

Proof (Proof of Lemma 1). Let $x$ be a fixed point in $\mathbb{R}^d$. Denote $Z_i = u_n(X_i) - \mathbb{E}u_n(X_i)$ where $u_n(\cdot) = K_{m_n}(\cdot, x)/\sqrt{m_n}$. Then

$$\sum_{i=1}^{n}Z_i = \frac{n}{\sqrt{m_n}}\big(\hat f_n(x) - \mathbb{E}\hat f_n(x)\big) = \frac{n}{\sqrt{m_n}}\,FL_n(x). \qquad (13)$$

The order of magnitude of the fluctuation $FL_n(x)$ is obtained by applying the inequality (5) to the centered sequence $\{Z_i\}_{1\le i\le n}$ defined above. We then control the normalized fluctuation of (13) with the covariance terms $C_k(r)$ defined in equation (6). Firstly, we bound the covariance terms:

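• Case $r = 0$. Here $t_1 = \cdots = t_k = i$. Then we get:

$$C_k(r) = |\mathrm{cov}(Z_{t_1}\cdots Z_{t_p}, Z_{t_{p+1}}\cdots Z_{t_k})| \le 2\,\mathbb{E}|Z_i|^k.$$

By definition of $Z_i$:

$$\mathbb{E}|Z_i|^k \le 2^k\,\mathbb{E}|u_n(X_i)|^k \le 2^k\|u_n\|_\infty^{k-1}\,\mathbb{E}|u_n(X_0)|. \qquad (14)$$

• Case $r > 0$. $C_k(r) = |\mathrm{cov}(Z_{t_1}\cdots Z_{t_p}, Z_{t_{p+1}}\cdots Z_{t_k})|$ is bounded in different ways, either using the weak dependence property or by a direct bound.

- Weak dependence bounds:

$\eta$-dependence: Consider the following application:

$$\phi_p : (x_1, \dots, x_p) \mapsto \big(u_n(x_1)\cdots u_n(x_p)\big).$$

Then $\|\phi_p\|_\infty \le 2^p\|u_n\|_\infty^p$ and $\mathrm{Lip}\,\phi_p \le 2^p\|u_n\|_\infty^{p-1}\,\mathrm{Lip}\,u_n$. Thus by $\eta$-dependence, for all $k \ge 2$ we have:

$$C_k(r) \le \big(p\,2^k\|u_n\|_\infty^{k-1} + (k-p)\,2^k\|u_n\|_\infty^{k-1}\big)\,\mathrm{Lip}\,u_n\,\eta_r \le k2^k\|u_n\|_\infty^{k-1}\,\mathrm{Lip}\,u_n\,\eta_r. \qquad (15)$$

$\tilde\phi$-dependence: We use the inequality (7). Using the bound

$$\mathbb{E}|\phi_p(X_1, \dots, X_p)| \le \|u_n\|_\infty^{p-1}\,\mathbb{E}|u_n(X_0)|,$$

we derive a bound for the covariance terms:

$$C_k(r) \le k2^k\|u_n\|_\infty^{k-2}\,\mathbb{E}|u_n(X_0)|\,\mathrm{Lip}\,u_n\,\tilde\phi(r). \qquad (16)$$

- Direct bound: The triangular inequality implies for $C_k(r)$:

$$|\mathrm{cov}(Z_{t_1}\cdots Z_{t_p}, Z_{t_{p+1}}\cdots Z_{t_k})| \le \underbrace{\Big|\mathbb{E}\prod_{i=1}^{k}Z_{t_i}\Big|}_{A} + \underbrace{\Big|\mathbb{E}\prod_{i=1}^{p}Z_{t_i}\Big|}_{B_p}\,\underbrace{\Big|\mathbb{E}\prod_{i=p+1}^{k}Z_{t_i}\Big|}_{B_{k-p}}.$$

$$A = \big|\mathbb{E}\big(u_n(X_{t_1}) - \mathbb{E}u_n(X_{t_1})\big)\cdots\big(u_n(X_{t_k}) - \mathbb{E}u_n(X_{t_k})\big)\big|,$$

$$\le |\mathbb{E}u_n(X_0)|^k + \big|\mathbb{E}\big(u_n(X_{t_1})\cdots u_n(X_{t_k})\big)\big| + \sum_{s=1}^{k-1}|\mathbb{E}u_n(X_0)|^{k-s}\sum_{t_{i_1}\le\cdots\le t_{i_s}}\big|\mathbb{E}\big(u_n(X_{t_{i_1}})\cdots u_n(X_{t_{i_s}})\big)\big|.$$

Firstly, with $k \ge 2$:

$$|\mathbb{E}u_n(X_0)|^k \le \|u_n\|_\infty^{k-2}\big(\mathbb{E}|u_n(X_0)|\big)^2.$$

Secondly, if $1 \le s \le k-1$:

$$\big|\mathbb{E}\big(u_n(X_{t_{i_1}})\cdots u_n(X_{t_{i_s}})\big)\big| \le \mathbb{E}\big|u_n(X_{t_{i_1}})\cdots u_n(X_{t_{i_s}})\big| \le \|u_n\|_\infty^{s-1}\,\mathbb{E}|u_n(X_0)|,$$

$$|\mathbb{E}u_n(X_0)|^{k-s} \le \|u_n\|_\infty^{k-s-1}\,\mathbb{E}|u_n(X_0)|.$$

Thirdly, there are at least two different observations with a gap of $r > 0$ among $X_{t_1}, \dots, X_{t_k}$, so for any integer $k \ge 2$:

$$\big|\mathbb{E}\big(u_n(X_{t_1})\cdots u_n(X_{t_k})\big)\big| \le \|u_n\|_\infty^{k-2}\,\mathbb{E}|u_n(X_0)u_n(X_r)|.$$

Then, collecting the last four inequalities yields:

$$A \le \|u_n\|_\infty^{k-2}\big(\mathbb{E}|u_n(X_0)|\big)^2 + \big(\mathbb{E}|u_n(X_0)|\big)^2\sum_{s=1}^{k-1}C_k^s\,\|u_n\|_\infty^{k-2} + \|u_n\|_\infty^{k-2}\,\mathbb{E}|u_n(X_0)u_n(X_r)|.$$

So:

$$A \le \|u_n\|_\infty^{k-2}\Big((2^k - 1)\big(\mathbb{E}|u_n(X_0)|\big)^2 + \mathbb{E}|u_n(X_0)u_n(X_r)|\Big). \qquad (17)$$

Now, we bound $B_i$ with $i < k$. As before:

$$B_i = \big|\mathbb{E}\big(u_n(X_{t_1}) - \mathbb{E}u_n(X_{t_1})\big)\cdots\big(u_n(X_{t_i}) - \mathbb{E}u_n(X_{t_i})\big)\big|,$$

$$\le \sum_{s=0}^{i}|\mathbb{E}u_n(X_0)|^{i-s}\sum_{t_{j_1}\le\cdots\le t_{j_s}}\big|\mathbb{E}\big(u_n(X_{t_{j_1}})\cdots u_n(X_{t_{j_s}})\big)\big|,$$

$$\le 2^i\|u_n\|_\infty^{i-2}\big(\mathbb{E}|u_n(X_0)|\big)^2.$$

Then:

$$B_p \times B_{k-p} \le 2^k\|u_n\|_\infty^{k-4}\big(\mathbb{E}|u_n(X_0)|\big)^4 \le 2^k\|u_n\|_\infty^{k-2}\big(\mathbb{E}|u_n(X_0)|\big)^2. \qquad (18)$$

Another interesting bound for $r > 0$ follows, because according to inequalities (17) and (18) we have:

$$C_k(r) \le \|u_n\|_\infty^{k-2}\Big((2^{k+1} - 1)\big(\mathbb{E}|u_n(X_0)|\big)^2 + \mathbb{E}|u_n(X_0)u_n(X_r)|\Big).$$

Noting $\gamma_n(r) = \mathbb{E}|u_n(X_0)u_n(X_r)| \vee \big(\mathbb{E}|u_n(X_0)|\big)^2$, we have:

$$C_k(r) \le 2^{k+1}\|u_n\|_\infty^{k-2}\,\gamma_n(r). \qquad (19)$$

We now use the different values of the bounds in inequalities (14), (15), (16) and (19). If we define the sequence $(w_r)_{0\le r\le n-1}$ as: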

• wa = 1, 

II CO LipM„?7r A E|u„(Xo)|Lip u„(^(r), 
then, for all r such that < r < n — 1 and for all A: > 2: 

Ck{r) < A;2'=||u„||^-V • 
We derive from this inequality and from (5): 

9 f / n-1 \ n-1 



q 



VnX:(r + l)«-C,(r) 



r=0 / r=0 

(n-\ \ ,u II s q-2 n-1 



The symbol $\lesssim$ means $\le$ up to a universal constant. In order to control $w_r$,
we give bounds for the terms $\gamma_n(r) = \mathbb{E}|u_n(X_0)u_n(X_r)| \vee (\mathbb{E}|u_n(X_0)|)^{2}$:

• In the case of [H4], we have:

\[ \begin{aligned}
\mathbb{E}|u_n(X_0)u_n(X_r)| &\le \sup_{j,k}\|f_{j,k}\|_\infty\,\|u_n\|_1^{2} , \\
(\mathbb{E}|u_n(X_0)|)^{2} &\le \|f\|_\infty^{2}\,\|u_n\|_1^{2} .
\end{aligned} \]

• In the case of [H5], Lemma 2.3 of [Pri01] proves that $\mathbb{E}|u_n(X_0)u_n(X_r)| \le (\mathbb{E}|u_n(X_0)|)^{2}$ for $n$ sufficiently large and the same bound as above remains true for the last term.

In both cases, we conclude that $\gamma_n(r) \lesssim \|u_n\|_1^{2}$. The properties (a), (b) and

(c) of section 2.2 ensures that ||w„||f ^ , ||u„||ooLip w„ ^ m^"'"^/'^ and 

E\uniXo)\UpUn ^ ml/'''. We then have for r > 1: 

Wr^—A A ml/'^4>r . (20) 

TTln 

In order to prove Lemma 1, it remains to control the sums 

II X fe-2 n-1 
Mn||.sc. \ 

r=0 

for $k = 2$ and $k = q$ in both Riemannian and geometric cases.

• Geometric case. 

Under [H1] or [H1']: We remark that $a \wedge b \le a^{\alpha} b^{1-\alpha}$ for all $\alpha \in [0;1]$.
Using (20), we obtain first that :< {rjr A ^r)"'^n^^^^^''^~^^~"^ for n 



^(r + l)'=-V, (21) 
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sufficiently large. Then for < a < 2^pj- we bound Wr independently of 

m„: Wr ^ {rjr A <f>r)"- For all even integer A: > 2 we derive from the form 
of rir A that (in the third inequality u = ar^): 

n—l n~l 



r=l r=0 

POO 

^ / r*""^ exp(-aor'')dr 
Jo 



1 roo 

1 / fc-1 1 

^ — ^rr / exp(— u)au 



ba b 



fc-1 



Using the Stirling formula, we can find a constant $B$ such that, for the
special cases $k = 2$ and $k = q$:



r=l 

• Riemannian case 



Under [H6] and [H2]: Let us recall that [H6] implies that $m_n \le n^{\delta}$ for $n$ sufficiently large and $0 < \delta < 1$, and that the assumption of Lemma 1 implies that:

\[ a > \max\Big( q-1,\ \frac{\delta(q-1)(4+2/d)}{q-2+\delta(4-q)},\ 2+\frac{1}{d} \Big) . \]

Then, we have $a > \max\Big( k-1,\ \frac{\delta(k-1)(4+2/d)}{k-2+\delta(4-k)} \Big)$ in both cases $k = q$ or $k = 2$. This assumption on $a$ implies that:

\[ \frac{(k+2/d)\delta+2-k}{2(a-k+1)} < \frac{(4-k)\delta+k-2}{2(k-1)} . \]

Furthermore, reminding that $0 < \delta < 1$:

\[ \frac{(4-k)\delta+k-2}{2(k-1)} = 1 - \frac{k(1+\delta)-4\delta}{2(k-1)} < 1 . \]

We derive from the two previous inequalities that there exists $\zeta_k \in\, ]0,1[$ such that

\[ \frac{(k+2/d)\delta+2-k}{2(a-k+1)} < \zeta_k < \frac{(4-k)\delta+k-2}{2(k-1)} . \]

For $k = q$ or $k = 2$, we now use Tran's technique as in [ABD02]. We divide
the sum (21) in two parts in order to bound it by sequences tending to 0,
due to the choice of $\zeta_k$:



^Ck]k-l 



k-2 [ri'^fc]-! , I , fe-2 



r—O 



^ ^(2a(fc-l)-((4-fe)5+fe-2))/2 

= 0(1) , 



fe-2 n-l , , ^ fc-2 



n / \ V Ti 

r=[n''fc] 

< ^(-2Cfe(a-fe-l)+((fe+2/d)5+2-fc))/2 

= 0(1) • 

Under [H6] and [H2']: Under the assumption of Lemma 1:

\[ a > \max\Big( q-1,\ \frac{\delta(q-1)(2+2/d)}{q-2+\delta(4-q)},\ 1+\frac{1}{d} \Big) , \]

we derive exactly as in the previous case that there exists $\zeta_k \in\, ]0;1[$ for
$k = q$ or $k = 2$ such that

\[ \frac{(k-2+2/d)\delta+2-k}{2(a-k+1)} < \zeta_k < \frac{(4-k)\delta+k-2}{2(k-1)} . \]

We then apply Tran's technique again, which bounds the sum (21) in that
case.

Lemma 1 directly follows from (13). □
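
The algebra behind the choice of $\zeta_k$ in the Riemannian case under [H6] and [H2] can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names `a_threshold` and `zeta_window` and the sample values of $d$, $\delta$, $k$ and $a$ are our assumptions. The formulas are the threshold $\delta(k-1)(4+2/d)/(k-2+\delta(4-k))$ on $a$ and the two bounds $((k+2/d)\delta+2-k)/(2(a-k+1))$ and $((4-k)\delta+k-2)/(2(k-1))$ that $\zeta_k$ must separate.

```python
from fractions import Fraction

def a_threshold(k, d, delta):
    """Threshold delta*(k-1)*(4+2/d) / (k-2+delta*(4-k)) on the decay rate a."""
    return delta * (k - 1) * (4 + Fraction(2, d)) / (k - 2 + delta * (4 - k))

def zeta_window(a, k, d, delta):
    """Lower and upper bounds that zeta_k must separate (needs a > k - 1)."""
    if a <= k - 1:
        raise ValueError("the proof assumes a > k - 1")
    lower = ((k + Fraction(2, d)) * delta + 2 - k) / (2 * (a - k + 1))
    upper = ((4 - k) * delta + k - 2) / (2 * (k - 1))
    return lower, upper

# Illustrative values (assumptions, not from the paper): d = 1, delta = 1/3.
d, delta = 1, Fraction(1, 3)
for k in (2, 4):
    thr = a_threshold(k, d, delta)
    for a in (thr - Fraction(1, 4), thr + Fraction(1, 4), thr + 3):
        if a <= k - 1:
            continue
        lower, upper = zeta_window(a, k, d, delta)
        # The window is non-empty exactly when a exceeds the threshold.
        assert (lower < upper) == (a > thr)
        assert 0 < upper < 1
```

For $k = 2$ the threshold reduces to $2 + 1/d$, which is the third term of the maximum in the assumption of Lemma 1.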

Remarks. We have in fact proved the following sharper result. There exists
a universal constant $C$ such that

\[ \Big(\frac{n}{m_n}\Big)^{q/2}\,\|FL_n(x)\|_q^{q} \le \begin{cases} (Cq)^{q} & \text{in the Riemannian case,} \\ (Cq^{1+1/b})^{q} & \text{in the geometric case.} \end{cases} \tag{22} \]

Proof (Proof of Lemma 2). The cases of Riemannian or geometric decay of
the dependence coefficients are considered separately.

• Geometric decay We present a technical lemma useful to deduce exponential probability inequalities from moment inequalities at any even order.

Lemma 4. If the variables $\{V_n\}_{n\in\mathbb{Z}}$ satisfy, for all $k \in \mathbb{N}$,

\[ \|V_n\|_{2k} \le \phi(2k) , \tag{23} \]

where $\phi$ is an increasing function with $\phi(0) = 0$, then:

\[ \mathbb{P}(|V_n| \ge \varepsilon) \le e^{2}\exp\big(-\phi^{-1}(\varepsilon/e)\big) . \]
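Proof. By Markov's inequality and Assumption (23), we obtain

\[ \mathbb{P}(|V_n| \ge \varepsilon) \le \Big(\frac{\phi(2k)}{\varepsilon}\Big)^{2k} . \]

With the convention $0^0 = 1$, the inequality is true for all $k \in \mathbb{N}$. Reminding
that $\phi(0) = 0$, there exists an integer $k_0$ such that $\phi(2k_0) \le \varepsilon/e < \phi(2(k_0+1))$. Noting $\phi^{-1}$ the generalized inverse of $\phi$, we have:

\[ \mathbb{P}(|V_n| \ge \varepsilon) \le \Big(\frac{\phi(2k_0)}{\varepsilon}\Big)^{2k_0} \le e^{-2k_0} = e^{2}e^{-2(k_0+1)} \le e^{2}\exp\big(-\phi^{-1}(\varepsilon/e)\big) . \]

□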

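We rewrite the inequality (22): $\big\|\sqrt{n/m_n}\,FL_n(x)\big\|_{2k} \le \phi(2k)$ with $\phi(x) = Cx^{(b+1)/b}$
for a convenient constant $C$. Applying Lemma 4 to $V_n = \sqrt{n/m_n}\,FL_n(x)$ we
obtain:

\[ \mathbb{P}\Big(|FL_n(x)| \ge \varepsilon\sqrt{\frac{m_n}{n}}\Big) \le e^{2}\exp\big(-\phi^{-1}(\varepsilon/e)\big) \]

and we obtain the result of Lemma 2.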
• Riemannian decay In this case, the result of Lemma 1 is obtained only
for some values of $q$ depending on the value of the parameter $a$:

- In the case of 77-dependence: 

/ l + (5 + 2/d „ 1 
a > max I g - 1, — ^ ,2 + - 

- In the case of (^-dependence: 

1 



a>ma.(, -1,1 + ^^,1 , ^ 

We consider that the assumptions of Lemma 2 on $a$ are satisfied in
both cases of dependence. Then $q_0 = 2\lceil (a-1)/2 \rceil$ is the even integer such that
$a - 1 \le q_0 < a + 1$. It is the largest order such that the assumptions
of Lemma 1 (recalled above) are verified, and Lemma 1 then gives
directly the rate of the moment:

\[ \limsup_{n\to\infty}\Big(\frac{n}{m_n}\Big)^{q_0/2}\|FL_n(x)\|_{q_0}^{q_0} < +\infty . \]

We apply Markov's inequality to obtain the result of Lemma 2:

\[ \mathbb{P}\Big(|FL_n(x)| \ge \varepsilon\sqrt{\frac{m_n}{n}}\Big) \lesssim \varepsilon^{-q_0} . \]

□ 
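
Both parts of this proof reduce to elementary computations that are easy to test numerically. The Python sketch below is an illustration under our own assumptions. It checks the bound of Lemma 4, $\mathbb{P}(|V| \ge \varepsilon) \le e^2\exp(-\phi^{-1}(\varepsilon/e))$, on a standard Gaussian variable, for which $\|V\|_{2k} = ((2k-1)!!)^{1/(2k)} \le \sqrt{2k}$ so that $\phi(x) = \sqrt{x}$ is admissible (this example is ours, not the paper's), and it computes $q_0 = 2\lceil (a-1)/2 \rceil$. The function names are hypothetical.

```python
import math

def lemma4_bound(eps, phi_inv):
    """Right-hand side e^2 * exp(-phi^{-1}(eps / e)) of Lemma 4."""
    return math.e ** 2 * math.exp(-phi_inv(eps / math.e))

def gaussian_tail(eps):
    """Exact P(|V| >= eps) for a standard normal V."""
    return math.erfc(eps / math.sqrt(2))

def q0(a):
    """Even integer 2 * ceil((a - 1) / 2); it satisfies a - 1 <= q0 < a + 1."""
    return 2 * math.ceil((a - 1) / 2)

# Moment assumption (23) for the Gaussian with phi(x) = sqrt(x):
# E V^(2k) = (2k-1)!! = (2k)! / (2^k k!).
for k in range(1, 30):
    moment = math.factorial(2 * k) // (2 ** k * math.factorial(k))
    assert moment ** (1 / (2 * k)) <= math.sqrt(2 * k)

# Tail bound of Lemma 4 with phi^{-1}(y) = y^2.
for eps in (0.5, 1.0, 2.0, 5.0, 10.0):
    assert gaussian_tail(eps) <= lemma4_bound(eps, lambda y: y * y)

# q0 is even and stays in the window [a - 1, a + 1).
for a in (3, 4, 5, 5.5, 7.2):
    assert a - 1 <= q0(a) < a + 1 and q0(a) % 2 == 0
```

In the geometric case of the proof the same helper would be called with $\phi^{-1}(y) = (y/C)^{b/(b+1)}$, the inverse of $\phi(x) = Cx^{(b+1)/b}$.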
Proof (Proof of Lemma 3). We follow here Liebscher's strategy as in [AD03].
We cover $B := B(0, M)$, the ball of center $0$ and radius $M$, by at least
{4:Mn + lY balls Bj = B{xj, Then, under the assumption that ifm(-) v) 
is supported on a compact of diameter proportional smaller than we 
have, for all j: 

\[ \sup_{x\in B_j}|FL_n(x)| \le |f_n(x_j) - \mathbb{E}f_n(x_j)| + C\,\frac{m_n^{1/d}}{\mu}\Big(|\tilde f_n(x_j) - \mathbb{E}\tilde f_n(x_j)| + 2|\mathbb{E}\tilde f_n(x_j)|\Big) , \tag{24} \]

with $C$ a constant and $\tilde f_n(x) = \frac{1}{n}\sum_{i=1}^{n}\tilde K_{m_n}(x, X_i)$ where $\tilde K_{m_n}$ is a kernel of type $\tilde K_m(x,y) = K_0\,m\,k(x,y,x_j,1/m^{1/d})$. The $1/b$-Lipschitz function
$k(x,y,a,b)$ is equal to 1 on $B(a,b)$ and null outside $B(a,b+1/b)$. The constant $K_0$ is fixed in order that $\tilde K_{m_n}$ satisfies properties (a), (b) and (c) of
section 2.2. Then using (24) and with obvious short notation:




sup \FLr,{x)\ >eJ—]< > P sup |FL„(x)| > e 
\M<M \ n '-^ \^eBj 



< [AM II + l)'' 

l/d 



sup P \FLn{x 





I l/d 

+P(2C^|E/„(a;,-)|>e 

Using the fact that $f$ is bounded, $\mathbb{E}\tilde f_n(x_j) = \int \tilde K_{m_n}(x_j,s)f(s)\,ds$ is bounded

independently of n. Then taking fj, = mn'^~^^^n^/'^L{n)/e ensures that 

P ( 2C —jj—\Efn {xj ) I > 1 is null for n sufficiently large. Applying Lemma 

2 to $f$ and $\tilde f$, the uniform probability inequalities in both cases of geometric and
Riemannian decay become:

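\[ \mathbb{P}\Big(\sup_{\|x\|\le M}|FL_n(x)| \ge \varepsilon\sqrt{\frac{m_n}{n}}\Big) \lesssim \mu^{d}\exp\big(-C\varepsilon^{b/(b+1)}\big) , \tag{25} \]

\[ \mathbb{P}\Big(\sup_{\|x\|\le M}|FL_n(x)| \ge \varepsilon\sqrt{\frac{m_n}{n}}\Big) \lesssim \mu^{d}\varepsilon^{-q_0} . \tag{26} \]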



In the geometric case, fix $\varepsilon_n$ as $G(\log n)^{(b+1)/b}$ such that the bound becomes $\mu^{d}n^{-GC}$. Reminding that $\mu \le n$, the sequence $\mu^{d}n^{-GC}$, bounded by
$n^{d-GC}$, is summable for a conveniently chosen constant $G$. The Borel-Cantelli
Lemma then concludes the proof in this case.
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In the Riemannian case, take e„ = {ml^'^^'^n^'^'''^'^) ™+<i logn such that the 
bound becomes n"""" log~'° ni(n). Reminding that qo > 2, this sequence is 
summable and here again we conclude by applying Borel-Cantelli's Lemma. 

□ 
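
For the geometric case, the summability requirement can be made explicit. Reading the bound as $\mu^{d}n^{-GC}$ with $\mu \le n$, as in the proof above, the following worked step (ours, not stated in the paper) gives a sufficient condition on $G$:

\[ \sum_{n\ge 1}\mu^{d}n^{-GC} \le \sum_{n\ge 1} n^{d-GC} < \infty \quad\text{whenever}\quad GC - d > 1, \ \text{that is}\ G > \frac{d+1}{C} . \]

Any such $G$ makes the probabilities summable, so the Borel-Cantelli lemma applies.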



Proof of the theorems 

The order of magnitude of the bias is given by Assumption (d) and the Lemmas provide bounds for the fluctuation term. It only remains to determine the
optimal bandwidth $m_n$ in each case.

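Proof (Proof of Theorem 1). Applying Lemma 1 yields Theorem 1 when $q$ is
an even integer. For any real $q$, Lemma 1 with the even integer $2(\lceil q/2\rceil + 1) \ge 2$ and Jensen's
inequality yield:

\[ \Big(\frac{n}{m_n}\Big)^{q/2}\mathbb{E}|FL_n(x)|^{q} \le \bigg(\Big(\frac{n}{m_n}\Big)^{\lceil q/2\rceil+1}\mathbb{E}|FL_n(x)|^{2(\lceil q/2\rceil+1)}\bigg)^{q/\{2(\lceil q/2\rceil+1)\}} . \]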



Plugging this bound and the bound for the bias in (12), we obtain a bound
for the $L^q$-error of estimation:

\[ \|f_n(x) - f(x)\|_q \le \|FL_n(x)\|_q + |R_n(x)| = O\Big(\sqrt{\frac{m_n}{n}} + m_n^{-p/d}\Big) . \]

The optimal bandwidth $m_n^* = n^{d/(2p+d)}$ is the same as in the i.i.d. case. Thus [H6]
holds with $\delta = d/(2p+d)$. For this value of $\delta$, the conditions on the parameter $a$
of Lemma 2 are equivalent to those of Theorem 1. □
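
The bandwidth choice is a balance between the fluctuation term $\sqrt{m_n/n}$ and the bias term $m_n^{-p/d}$. A minimal numerical sketch in Python, with hypothetical function names and illustrative values of $n$, $p$ and $d$ that are not taken from the paper:

```python
import math

def optimal_bandwidth(n, p, d):
    """m* = n^(d/(2p+d)), which equates sqrt(m/n) and m^(-p/d)."""
    return n ** (d / (2 * p + d))

def pointwise_rate(n, p, d):
    """Resulting rate n^(-p/(2p+d)) for the L^q error at a fixed point."""
    return n ** (-p / (2 * p + d))

# Illustrative values (assumptions): n observations, regularity p, dimension d.
n, p, d = 10_000, 2.0, 1
m = optimal_bandwidth(n, p, d)
fluctuation = math.sqrt(m / n)
bias = m ** (-p / d)

# At m*, the two terms coincide and equal the claimed rate.
assert math.isclose(fluctuation, bias)
assert math.isclose(bias, pointwise_rate(n, p, d))
```

The common value of the two terms is the rate $n^{-p/(2p+d)}$ of equation (2).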

Proof (Proof of Theorem 2). Applying the probability inequality (25) in the
proof of Lemma 3 and the identity $\mathbb{E}|Y|^{q} = \int_0^{+\infty}\mathbb{P}(|Y| > t^{1/q})\,dt$, we obtain

\[ \mathbb{E}\sup_{\|x\|\le M}|f_n(x) - f(x)|^{q} = O\bigg(\Big(\sqrt{\frac{m_n}{n}}\log^{(b+1)/b}(n)\Big)^{q} + m_n^{-qp/d}\bigg) . \]

Lemma 3 gives the rate of almost sure convergence:

\[ \sup_{\|x\|\le M}|f_n(x) - f(x)| =_{a.s.} O\bigg(\sqrt{\frac{m_n}{n}}\log^{(b+1)/b}(n) + m_n^{-p/d}\bigg) . \]

In both cases, the optimal bandwidth is $m_n^* = \big(n/\log^{2(b+1)/b}(n)\big)^{d/(2p+d)}$,
which yields the rates claimed in Theorem 2. □
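
The two ingredients of this proof can be written out. The following lines are our own worked steps, under the notation used above. The moment identity is the tail formula applied to the nonnegative variable $|Y|^{q}$:

\[ \mathbb{E}|Y|^{q} = \int_0^{+\infty}\mathbb{P}(|Y|^{q} > t)\,dt = \int_0^{+\infty}\mathbb{P}(|Y| > t^{1/q})\,dt , \]

and the bandwidth balances the fluctuation and bias terms of the rate:

\[ \sqrt{\frac{m_n}{n}}\log^{(b+1)/b}(n) = m_n^{-p/d} \iff m_n^{(2p+d)/(2d)} = \frac{\sqrt{n}}{\log^{(b+1)/b}(n)} \iff m_n = \Big(\frac{n}{\log^{2(b+1)/b}(n)}\Big)^{d/(2p+d)} . \]

With this choice both terms are of order $\big(\log^{2(b+1)/b}(n)/n\big)^{p/(2p+d)}$.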
Proof (Proof of Theorem 3). Applying the probability inequality (26) and the
same line of reasoning as in the previous proof, we obtain



E sup |/„(x)-/(x)|« = 
lkll<M 

where $q_0 = 2\lceil (a-1)/2 \rceil$. The optimal bandwidth $m_n^* = n^{d/(d+2p+2d/(q_0+d))}$
implies [H6] with $\delta = d/(d+2p+2d/(q_0+d))$. For this value of $\delta$, the conditions
on $a$ of Lemma 2 are satisfied as soon as $a > 4$ and $p > 2d$.
Lemma 3 gives the rate for the fluctuation in the almost sure case. This leads to
the optimal bandwidth




2)+pC<I0+<i) 



We then deduce the two different rates of Theorem 3, either in the almost
sure or in the $L^q$ framework. □